Search for: All records

Creators/Authors contains: "Alexandrov, Boian"


  1. DNA breathing dynamics (transient base-pair opening and closing due to thermal fluctuations) are vital for processes like transcription, replication, and repair. Traditional models, such as the Extended Peyrard-Bishop-Dauxois (EPBD) model, provide insights into these dynamics but are computationally limited for long sequences. We present JAX-EPBD, a high-throughput Langevin molecular dynamics framework leveraging JAX for GPU-accelerated simulations, achieving up to 30x speedup and superior scalability compared to the original C-based EPBD implementation. JAX-EPBD efficiently captures time-dependent behaviors, including bubble lifetimes and base flipping kinetics, enabling genome-scale analyses. Applying it to transcription factor (TF) binding affinity prediction using SELEX datasets, we observed consistent improvements in R² values when incorporating breathing features with sequence data. Validating on the 77-bp AAV P5 promoter, JAX-EPBD revealed sequence-specific differences in bubble dynamics correlating with transcriptional activity. These findings establish JAX-EPBD as a powerful and scalable tool for understanding DNA breathing dynamics and their role in gene regulation and transcription factor binding.
    Free, publicly-accessible full text available December 12, 2025
  2. Simulating DNA breathing dynamics, for instance with the Extended Peyrard-Bishop-Dauxois (EPBD) model, across the entire human genome using traditional biophysical methods like pyDNA-EPBD is computationally prohibitive because of intensive techniques such as Markov Chain Monte Carlo (MCMC) and Langevin dynamics. To overcome this limitation, we propose a deep surrogate generative model utilizing a conditional Denoising Diffusion Probabilistic Model (DDPM) trained on DNA sequence-EPBD feature pairs. This surrogate model efficiently generates high-fidelity DNA breathing features conditioned on DNA sequences, reducing computational time from months to hours, a speedup of over 1000 times. By integrating these features into the EPBDxDNABERT-2 model, we enhance the accuracy of transcription factor (TF) binding site predictions. Experiments demonstrate that the surrogate-generated features perform comparably to those obtained from the original EPBD framework, validating the model's efficacy and fidelity. This advancement enables real-time, genome-wide analyses, significantly accelerating genomic research and offering powerful tools for disease understanding and therapeutic development.
    Free, publicly-accessible full text available December 10, 2025
  3. Large Language Models (LLMs) are pre-trained on large-scale corpora and excel in numerous general natural language processing (NLP) tasks, such as question answering (QA). Despite their advanced language capabilities, when it comes to domain-specific and knowledge-intensive tasks, LLMs suffer from hallucinations, knowledge cut-offs, and lack of knowledge attributions. Additionally, fine-tuning LLMs' intrinsic knowledge to highly specific domains is an expensive and time-consuming process. The retrieval-augmented generation (RAG) process has recently emerged as a method capable of optimizing LLM responses by referencing them to a predetermined ontology. It was shown that using a Knowledge Graph (KG) ontology for RAG improves the QA accuracy by taking into account relevant sub-graphs that preserve the information in a structured manner. In this paper, we introduce SMART-SLIC, a highly domain-specific LLM framework that integrates RAG with a KG and a vector store (VS) that stores factual domain-specific information. Importantly, to avoid hallucinations in the KG, we build these highly domain-specific KGs and VSs without the use of LLMs, relying instead on NLP, data mining, and nonnegative tensor factorization with automatic model selection. Pairing our RAG with a domain-specific (i) KG (containing structured information) and (ii) VS (containing unstructured information) enables the development of domain-specific chat-bots that attribute the source of information, mitigate hallucinations, lessen the need for fine-tuning, and excel in highly domain-specific question answering tasks. We pair SMART-SLIC with chain-of-thought prompting agents. The framework is designed to be generalizable so that it can adapt to any specific or specialized domain. In this paper, we demonstrate the question answering capabilities of our framework on a corpus of scientific publications on malware analysis and anomaly detection.
    Free, publicly-accessible full text available December 18, 2025
  4. API 5L Grade X65 steel pipes, internally clad with alloy 625, are commonly utilized in pipelines and risers for subsea oil and gas extraction. Girth welds in such pipes are conventionally made using alloy 625 filler metal. However, alloy 625 weld metal cannot meet the base metal yield strength overmatching requirement for subsea reel lay installation. This study explored materials selection and process development for low-alloy steel girth welds in API 5L Grade X65 steel pipes, internally clad with alloy 625. Welding with a higher melting point filler metal over a lower melting substrate, i.e., low-alloy steel over Ni-based alloy, is impractical due to increased susceptibility to solidification cracking and solidification shrinkage porosity. Pseudo-binary phase diagrams developed for various combinations of low-alloy steel filler metals and Ni-based alloy substrates identified good compatibility between ER80S-G filler metal and alloy 686. The solidification temperature range and the tendency for partitioning of alloying elements were significantly lower throughout the entire ER80S-G/alloy 686 dilution range than in the low-alloy steel filler metal/alloy 625 combinations. An extensive process optimization effort to reduce the dilution of the alloy 686 root pass in the low-alloy steel weld metal and avoid incomplete fusion defects allowed for the production of defect-free girth welds. These welds met the yield strength and ductility requirements for subsea reel lay installation of pipelines. Process optimization for bead tempering significantly narrowed the high hardness region in the ER80S-G/alloy 686 partially mixed zone.
  5. This study addresses the limitations of cross weld tensile testing (CWTT) in quantifying local mechanical properties across microstructural and compositional gradients in dissimilar- and matching-filler-metal welds. A digital image correlation (DIC) methodology was validated for application in CWTT by direct comparison of stress-strain curves generated using conventional and virtual DIC extensometers in tensile testing of homogeneous steel samples. DIC-instrumented CWTT of a dissimilar-metal weld made with Alloy 625 filler metal on F65 steel demonstrated the capability to quantify the local yield strength, strain-hardening kinetics, and strain at failure in the base metal, heat-affected zone (HAZ), fusion boundary (FB) region, and weld metal in dissimilar and matching filler metal welds. It was shown that the high strain-hardening capacity in Alloy 625 weld metal led to base metal failure in CWTT despite the lower Alloy 625 weld metal yield strength. It was also shown that DIC-instrumented CWTT can be used for determining weld metal undermatching and overmatching conditions in compositionally matching and dissimilar-metal welds. Furthermore, by quantifying local strain distribution (both elastic and plastic) in the HAZ, FB region, and weld metal, DIC-instrumented CWTT provides an additional method for evaluating hydrogen-assisted cracking susceptibility in dissimilar-metal welds.
  6. An externally restrained stress relief cracking test was developed and demonstrated by testing welds in Cr–Mo steels that are susceptible and resistant to cracking. Unlike other externally restrained tests, it simultaneously applies stress and compensates for thermal expansion during heating to the post-weld heat treatment temperature, and it utilises digital image correlation to quantify key characteristics of the stress relaxation and stress relief cracking phenomena. In contrast with materials resistant to stress relief cracking, susceptible materials experienced lower levels of stress relaxation, strain absorption, and sustained mechanical energy, with accelerated kinetics of strain accumulation and strain localisation leading to failure. The processes of stress relief cracking and stress relaxation were quantified as low-strain, slow-strain-rate, low-energy phenomena.
  7. The tempering response in the heat-affected zone (HAZ) of low alloy steels during temper bead welding is heavily dependent on the experienced thermal history. Past work has developed quantification approaches for isothermal tempering conditions and single non-isothermal tempering cycles, whereas temper bead welding processes impart multiple non-isothermal cycles throughout the HAZ. This work outlines a novel methodology for tempering response quantification that allows for prediction of the HAZ hardness in multipass welding. The quantification approach utilizes a modification of the Grange-Baughman tempering parameter that converts non-isothermal cycles into an equivalent isothermal cycle and correlates this with the resulting hardness. This relationship can be utilized to evaluate hardness distributions throughout the HAZ of low alloy steel temper bead weldments based on the experienced thermal histories. It was shown that, in contrast with conventional heat treatment, temper bead welding in Grade 22 steel results in nucleation of high-density, finely dispersed Fe-Cr rich carbides. The proposed methodology was applied to evaluate the HAZ hardness resulting from multiple tempering reheats in a particular heat of Grade 22 steel, and was experimentally validated using a three-layer weld overlay. It was found that the peak temperature of weld tempering cycles was the most significant factor in controlling HAZ hardness.
  8. A dissimilar weld joining a low alloy steel (LAS) butter weld to an F65 steel pipe, made using a narrow groove hot wire gas tungsten arc welding (HW-GTAW) procedure with Alloy 625 filler metal, was investigated. The weld interpass microstructure consists of large swirls formed by a macrosegregation mechanism involving partial, non-uniform mixing of liquid base metal with the lower melting temperature weld pool, followed by fast solidification. This mechanism produces steep gradients in composition and solidification behavior. The resulting swirls are composed of alternating iron-rich peninsulas and partially mixed zones (PMXZs) that are surrounded by planar and cellular zones exhibiting multiple solidification directions. Large austenitic grains, encompassing planar, cellular, and dendritic morphologies, nucleate off peninsulas in direct contact with the weld pool. The highest hardness was found in nickel- and chromium-rich PMXZs that exhibited a lath martensite microstructure. In the event of exposure to hydrogen-containing environments, the PMXZs could serve as nucleation sites for hydrogen-assisted cracking.
  9. Finding the inherent organization in the structure space of a protein molecule is central in many computational studies of proteins. Grouping or clustering tertiary structures of a protein has been leveraged to build representations of the structure-energy landscape, highlight stable and semi-stable structural states, support models of structural dynamics, and connect them to biological function. Over the years, our laboratory has introduced methods to reveal structural states and build models of state-to-state protein dynamics. These methods have also been shown to be competitive for an orthogonal problem known as model selection, where model refers to a computed tertiary structure. Building on this work, in this paper we present a novel, tensor factorization-based method that doubles as a non-parametric clustering method. While the method has broad applicability, here we focus on and demonstrate its efficacy for the estimation of model accuracy (EMA) problem. The method outperforms state-of-the-art methods, including single-model methods that leverage deep neural networks and domain-specific insight.
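
Record 1 describes Langevin simulations of a Peyrard-Bishop-Dauxois-type model, in which each base pair has an opening displacement y_n governed by an on-site Morse potential and a nearest-neighbour stacking term. The following is a minimal sketch of that idea and not the JAX-EPBD implementation: it is plain Python rather than JAX, it uses the overdamped (Brownian) limit of Langevin dynamics with an Euler-Maruyama step instead of full molecular dynamics, and it uses one sequence-independent stacking constant, whereas the extended model uses sequence-specific stacking. All function names and parameter values are illustrative assumptions.

```python
import math
import random

# Illustrative PBD-style parameters (assumed for this sketch, not from the paper).
# Morse depth D (eV) and inverse width a (1/Angstrom) per base.
MORSE = {"A": (0.05, 4.2), "T": (0.05, 4.2), "G": (0.075, 6.9), "C": (0.075, 6.9)}
K, RHO, B = 0.025, 2.0, 0.35  # stacking constants, homogeneous by assumption


def energy(y, seq):
    """Total potential: Morse on-site terms plus anharmonic stacking terms."""
    total = 0.0
    for n, base in enumerate(seq):
        d, a = MORSE[base]
        total += d * (math.exp(-a * y[n]) - 1.0) ** 2
        if n > 0:
            diff = y[n] - y[n - 1]
            s = RHO * math.exp(-B * (y[n] + y[n - 1]))
            total += 0.5 * K * (1.0 + s) * diff ** 2
    return total


def gradient(y, seq):
    """Analytic derivative of the total potential with respect to each y_n."""
    grad = [0.0] * len(y)
    for n, base in enumerate(seq):
        d, a = MORSE[base]
        e = math.exp(-a * y[n])
        grad[n] += -2.0 * a * d * e * (e - 1.0)
        if n > 0:
            diff = y[n] - y[n - 1]
            s = RHO * math.exp(-B * (y[n] + y[n - 1]))
            shared = -0.5 * K * B * s * diff ** 2  # from the exponential factor
            grad[n] += shared + K * (1.0 + s) * diff
            grad[n - 1] += shared - K * (1.0 + s) * diff
    return grad


def langevin_step(y, seq, dt, kT, mobility, rng):
    """One Euler-Maruyama step of overdamped Langevin dynamics."""
    g = gradient(y, seq)
    sigma = math.sqrt(2.0 * mobility * kT * dt)  # thermal noise amplitude
    return [yn - mobility * gn * dt + sigma * rng.gauss(0.0, 1.0)
            for yn, gn in zip(y, g)]


def open_fraction(traj, threshold):
    """Per-base-pair fraction of frames whose displacement exceeds threshold."""
    n_frames = len(traj)
    return [sum(frame[i] > threshold for frame in traj) / n_frames
            for i in range(len(traj[0]))]


if __name__ == "__main__":
    rng = random.Random(0)
    seq = "GCATATGC"
    y = [0.0] * len(seq)
    traj = []
    for _ in range(2000):
        y = langevin_step(y, seq, dt=0.005, kT=0.0267, mobility=1.0, rng=rng)
        traj.append(y)
    print(open_fraction(traj, threshold=0.1))
```

The per-base-pair opening fraction is one simple breathing feature; bubble lifetimes could be derived from the same trajectory by tracking how long runs of consecutive open base pairs persist. The small time step is deliberate, because the Morse wall is steep for negative displacements and a large step would overshoot it.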